Immunoglobulin Superfamily Domains and Fragments with
Increased Solubility

Field of the Invention 

The present invention relates to the modification of
immunoglobulin superfamily (IgSF) domains and derivatives
thereof so as to increase their solubility, and hence their
yield and ease of handling.

Background to the Invention 

Small antibody fragments show exciting promise for
use as therapeutic agents, diagnostic reagents, and for
biochemical research. Thus, they are needed in large amounts,
and the expression of antibody fragments, e.g. Fv, single-chain
Fv (scFv), or Fab in the periplasm of E. coli (Skerra &
Plückthun, 1988; Better et al., 1988) is now used routinely in
many laboratories. Expression yields vary widely, however,
especially in the case of scFvs. While some fragments yield up
to several mg of functional, soluble protein per litre and OD
of culture broth in shake flask culture (Carter et al., 1992;
Plückthun et al., 1996), other fragments may almost exclusively
lead to insoluble material, often found in so-called inclusion
bodies. Functional protein may be obtained from the latter in
modest yields by a laborious and time-consuming refolding
process. The factors influencing antibody expression levels
are still only poorly understood. Folding efficiency and
stability of the antibody fragments, protease lability and
toxicity of the expressed proteins to the host cells often
severely limit actual production levels, and several attempts
have been made to increase expression yields. For example,
Knappik & Plückthun (1995) have identified key residues in the
antibody framework which influence expression yields
dramatically. Similarly, Ullrich et al. (1995) found that point
mutations in the CDRs can increase the yields in periplasmic
antibody fragment expression. Nevertheless, these strategies
are only applicable to a few antibodies.

The observations by Knappik & Plückthun (1995)
indicate that optimizing those parts of the antibody fragment
which are not directly involved in antigen recognition can
significantly improve folding properties and production yields
of recombinant Fv and scFv constructs. The causes for the
improved expression behavior lie in the decreased aggregation
behavior of these molecules. For other molecules, fragment
stability and protease resistance may also be affected. The
understanding of how specific sequence modifications change
these properties is still very limited and currently under
active investigation.

Difficulties in expressing and manipulating protein 
domains may arise because amino acids which are normally buried 
within the protein structure become exposed when only a portion 
of the whole molecule is expressed. Aggregation may occur 
through interaction of newly solvent-exposed hydrophobic 
residues originally forming the contact regions between 
adjacent domains. Leistler and Perham (1994) showed that a
certain domain of glutathione reductase may be expressed
separately from its neighboring domains, but the protein showed
non-specific association in vitro, forming multimeric protein
species. The introduction of hydrophilic residues instead of
exposed hydrophobic amino acids could decrease this aggregation
tendency and thus stabilize this isolated domain. Both wild
type and modified domains were exclusively found in inclusion
bodies and had to be refolded. Although in vitro experiments
have contributed substantially to defining the various
intermolecular interactions which drive folding processes, they
are only of limited value in predicting the folding behaviour of
different polypeptide chains in vivo (Gething & Sambrook, 1992).
Thus, Leistler and Perham do not teach or suggest how to increase
expression yields of soluble protein domains.

In the case of antibodies, two chains comprising
several domains dimerize, each domain consisting of a β-barrel
whose two β-sheets are held together by a disulphide bond,
forming the so-called immunoglobulin fold. Two domains, one
variable domain (VL) and one constant domain (CL), are adjacent
along the longitudinal axis in the light chain (VL-CL), and
four domains, one variable domain (VH) and three constant
domains (CH1 to CH3), are adjacent along the longitudinal axis in
the heavy chain (VH-CH1-CH2-CH3). In the dimer formed by
chains a and b, two such domains associate laterally: VL(a) with
VH(a), CL(a) with CH1(a), VL(b) with VH(b), CL(b) with CH1(b),
CH2(a) with CH2(b), and CH3(a) with CH3(b). In WO 92/01787
(Johnson et al., 1992), it is taught that isolated single
domains, e.g. VH, can be modified in the former VL/VH interface
region by exchanging hydrophobic residues by hydrophilic ones
without changing the specificity of the parent domain. The
rationale for WO 92/01787 was the assumption that exposed
hydrophobic residues might lead to non-specific binding,
interaction with surfaces and decreased stability. Data for
increase in binding specificity was given, but increase in
expression level was not shown. Furthermore, WO 92/01787 would
not be applicable to any antibody fragment containing the
complete antigen binding site, as it must contain VL and VH.

In the case of T cell receptors, two chains (α and β)
dimerize, each consisting of a variable (V) and a constant (C)
domain with the immunoglobulin fold, and one transmembrane
domain. In each chain, the variable and constant domains are
adjacent along the longitudinal axis in the chains (Vα-Cα;
Vβ-Cβ) and associate laterally with the corresponding domains of
the second chain (Vα-Vβ; Cα-Cβ).

Various other molecules of the immunoglobulin 
superfamily, such as CD2, CD4, CD16, CD22, comprise only one 
chain, wherein two or more domains (variable and/or constant) 
with the immunoglobulin fold are adjacent along the 
longitudinal axis in the chains. 

The present inventors have found that expression
problems are largely associated with a part of the molecule
that has hitherto not been regarded as relevant for expression
studies and which comprises the interface between adjacent
domains within an immunoglobulin chain. This surprising
finding forms the basis of the present invention, which
provides a general solution to the problems associated with
production of domains or fragments of the immunoglobulin
superfamily (IgSF), especially antibody fragments, which
exhibit poor solubility or reduced levels of expression.

Detailed Description of the Invention

In addition to lateral interactions between domains
of different chains described above, there are well documented
contacts between adjacent domains within individual chains
along the longitudinal axis. For example, in the case of an
antibody (Lesk & Chothia, 1988), the "bottom" of VL makes
contact with the "top" of CL, and, in a similar manner, there
are contacts between VH and CH1. The contacts at these
inter-domain interfaces are probably essential for the compact
arrangement of the Fab fragment, and, as is typical for such
contacts, are at least partially hydrophobic in nature (Lesk &
Chothia, 1988).

The basis of the present invention is the surprising
finding that the solubility (and hence the yield) of antibody
fragments comprising at least one domain can be dramatically
increased by decreasing the hydrophobicity of former interfaces
at the "end" of said domain, where it would normally adjoin a
second domain within a chain in a larger antibody fragment or
full antibody. This is surprising and could not have been
predicted from the prior art (WO 92/01787), because the size of
the longitudinal interface, for example, in a scFv fragment, is
much smaller than that between VH and VL, and therefore, the
amino acids which make up the interfaces between VH and CH1 or
between VL and CL in a Fab fragment represent a much smaller
proportion of the total surface area of the scFv molecule, and
would accordingly be expected to play less of a role in
determining the physical properties of the molecule.

The present invention has the additional advantage 
that because the alterations effected in the molecules that 
lead to said decreased hydrophobicity of former interfaces are 
located at the part of the domain most distant from the CDRs,
applying the invention is unlikely to have a deleterious effect 
on the binding properties of the molecule. This is not the 
case in WO 92/01787, where at least one modification is close 
to the CDRs and may therefore be expected to have an impact on 
antigen binding. Furthermore, WO 92/01787 cannot be applied to 
VL/VH heterodimers, as explained above. 

The present invention relates to a modified 
immunoglobulin superfamily (IgSF) domain or fragment which 
differs from a parent IgSF domain or fragment in that the 
region which comprised or would comprise the interface with a
second domain adjoined to said parent IgSF domain or fragment 
within the protein chain of a larger IgSF fragment or a full 
IgSF protein, and which is exposed in said parent IgSF domain 
or fragment in the absence of said second domain, is made more 
hydrophilic by modification. 

In the context of the present invention, the term
immunoglobulin superfamily (IgSF) domain refers to those parts
of members of the immunoglobulin superfamily which are
characterized by the immunoglobulin fold, said superfamily
comprising the immunoglobulins or antibodies, and various other
proteins such as T-cell receptors or integrins. The term IgSF
fragment refers to any portion of a member of the
immunoglobulin superfamily, said portion comprising at least
one IgSF domain. The term adjoining domain refers to a domain
which is contiguous with a first domain. The term interface
refers to a region of said first domain where interaction with
the adjoining domain takes place. The terms hydrophobic and
hydrophilic refer to a physical property of amino acids, which
can be estimated quantitatively: tabulated values of
hydrophobicity for the twenty naturally-occurring amino acids
are available (Nozaki & Tanford, 1971; Casari & Sippl, 1992;
Rose & Wolfenden, 1993).

The residues to be modified can be identified in a 
number of ways. For example, in one way, the solvent 
accessibilities (Lee & Richards, 1971) of hydrophobic interface 
residues in said parent IgSF fragment compared to said larger 
IgSF fragment or full IgSF protein are calculated, with high 
accessibilities indicating highly exposed residues. In a 
second way, the number of van der Waals contacts of hydrophobic 
interface residues in said larger IgSF fragment or full IgSF 
protein is calculated. A large number for a residue of said 
parent domain indicates that it will be highly solvent-exposed 
in the absence of an adjoining domain. There are other ways of 
calculating or determining residues to be modified according to 
the present invention, and one of ordinary skill in the art 
will be able to identify and practice these ways. 
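
As an illustration of the first approach, the following Python sketch
compares per-residue side-chain accessibilities in the isolated
fragment with those in the larger fragment. The residue labels, the
accessibility numbers, the 0.25 threshold and the set of residue types
treated as hydrophobic are all hypothetical, chosen only for
illustration; in practice the accessibilities would come from a
program implementing the Lee & Richards algorithm.

```python
# Residue types treated as hydrophobic (an assumption of this sketch).
HYDROPHOBIC = {"ala", "val", "leu", "ile", "met", "phe", "trp", "pro"}


def exposed_hydrophobic(residues, threshold=0.25):
    """Return labels of hydrophobic residues whose relative side-chain
    accessibility rises by more than `threshold` when the adjoining
    domain is removed.

    residues: iterable of (label, amino_acid, acc_fragment, acc_larger),
              with relative accessibilities given on a 0..1 scale.
    """
    hits = []
    for label, amino_acid, acc_fragment, acc_larger in residues:
        gain = acc_fragment - acc_larger
        if amino_acid in HYDROPHOBIC and gain > threshold:
            hits.append(label)
    return hits


# Hypothetical input values, not measured data.
example = [
    ("X1", "leu", 0.80, 0.10),  # hydrophobic, strongly exposed on removal
    ("X2", "val", 0.65, 0.20),  # hydrophobic, exposed on removal
    ("X3", "val", 0.30, 0.28),  # hydrophobic, essentially unchanged
    ("X4", "ser", 0.70, 0.15),  # exposed on removal, but hydrophilic
]
print(exposed_hydrophobic(example))
```

Only residues that are both hydrophobic and newly exposed are
reported; the threshold would need to be tuned to the data at hand.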

By analyzing computer models of said parent IgSF
fragment, interactions of said highly exposed residues within
the fragment can be identified. Such interactions could
stabilize the parent IgSF fragment. Residues which interact
closely with other hydrophobic residues, and which can be
identified by anyone of ordinary skill in the art, should
preferably not be mutated.

The modification referred to above may be effected in
a number of ways which are well known to one skilled in the
art. In a preferred embodiment, the modification is a
substitution of one or more amino acids at the exposed
interface, identified as described above, with amino acids
which are more hydrophilic. Alternatively, one or more amino
acids can be inserted in said interface, or one or more amino
acids can be deleted from said interface, so as to increase its
overall hydrophilicity. Furthermore, any combination of
substitution, insertion and deletion can be effected to reduce
the hydrophobicity of said interface. Also comprised by the
present invention is the possibility that the substitution or
insertion comprises amino acids with a relatively high
hydrophobicity value, or that the deletion comprises amino
acids with relatively low hydrophobicity value, as long as the
overall hydrophilicity value is increased in said interface
region. Modifications such as substitution, insertion and
deletion can be effected using standard methods which are well
known to practitioners skilled in the art. By way of example,
the skilled artisan can use either site-directed or PCR-based
mutagenesis (Ho et al., 1989; Kunkel et al., 1991; Trower,
1994; Viville, 1994), or total gene synthesis (Prodromou &
Pearl, 1992) to effect the necessary modification(s). In a
further embodiment, the mutations may be obtained by random
mutagenesis and screening of random mutants, using a suitable
expression and screening system (see, for example, Stemmer,
1994; Crameri et al., 1996).
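
The criterion that only the overall hydrophilicity of the interface
needs to increase can be sketched as follows. This is a minimal
illustration under two assumptions that are not taken from the text:
the Kyte & Doolittle hydropathy scale is used in place of the
tabulations cited above, and "overall" is taken to mean the sum over
the interface residues. The example sequences are hypothetical.

```python
# Kyte & Doolittle hydropathy values (higher = more hydrophobic).
# Using this particular scale is an assumption of the sketch.
KD_SCALE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def total_hydropathy(interface):
    """Sum of hydropathy values over interface residues (one-letter codes)."""
    return sum(KD_SCALE[aa] for aa in interface)


def is_more_hydrophilic(parent, modified):
    """True if the modified interface has a lower summed hydropathy."""
    return total_hydropathy(modified) < total_hydropathy(parent)


parent = "LPLV"        # hypothetical interface residues
substituted = "LPLD"   # substitution V -> D
inserted = "LPLVFKK"   # insertion of hydrophobic F together with two K

print(is_more_hydrophilic(parent, substituted))
print(is_more_hydrophilic(parent, inserted))
```

The second case mirrors the remark above that an insertion may contain
a relatively hydrophobic amino acid as long as the interface as a
whole becomes more hydrophilic.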

In a preferred embodiment, the amino acid(s) which
replace(s) the more hydrophobic amino acids include Asn, Asp,
Arg, Gln, Glu, Gly, His, Lys, Ser, and Thr. These are among
the more hydrophilic of the 20 naturally-occurring amino acids,
and have proven to be particularly effective in the application
of the present invention. Said amino acids, alone or in
combination, or in combination with other amino acids, can also
be used to form the above mentioned insertion which makes the
interface region more hydrophilic.

The parent IgSF domain or fragment referred to above
can be one of several different types. In a preferred
embodiment, said parent domain or fragment is derived from an
antibody. In one embodiment, said parent antibody fragment
comprises an Fv fragment. In this context, the term Fv
fragment refers to a complex comprising the VL (variable light)
and VH (variable heavy) portions of the antibody molecule. In
a further embodiment, the parent antibody fragment may be a
single-chain Fv fragment (scFv; Bird et al., 1988; Huston et
al., 1988), in which the VL and VH chains are joined, in either
a VL-VH or VH-VL orientation, by a peptide linker. In yet a
further embodiment, the parent antibody fragment may be an Fv
fragment stabilized by an inter-domain disulphide bond. This
is a structure which can be made by engineering into each chain
a single cysteine residue, wherein said cysteine residues from
two chains become linked through oxidation to form a disulphide
(Glockshuber et al., 1990; Brinkmann et al., 1993).

In a most preferred embodiment, the interface region
of the variable domains mentioned above comprises residues 9,
10, 12, 15, 39, 40, 41, 80, 81, 83, 103, 105, 106, 106A, 107,
108 for VL, and residues 9, 10, 11, 13, 14, 41, 42, 43, 84, 87,
89, 105, 108, 110, 112, 113 for VH according to the Kabat
numbering system (Kabat et al., 1991). Said numbering system
was established for the sequences of whole antibodies, but can
be adapted correspondingly to describe the sequences of
isolated antibody domains or antibody fragments, even in the
case of scFv fragments, where VL and VH are connected via a
peptide linker, and where the protein sequence from N- to C-
terminus has to be numbered differently. This means that the
Kabat numbering system is used in the present invention as a
sequence description relative to the existing data on antibody
sequences, not as an absolute description of actual positions
within the antibody fragment sequences of interest.
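
For bookkeeping purposes, the interface positions listed above can be
held in a simple lookup, as in the following sketch. The position sets
are those given in the text; the label format (chain letter followed
by the Kabat position, e.g. "L106A") and the helper names are
assumptions made for this illustration.

```python
import re

# Interface positions as listed in the text (Kabat numbering).
INTERFACE = {
    "L": {"9", "10", "12", "15", "39", "40", "41", "80", "81", "83",
          "103", "105", "106", "106A", "107", "108"},
    "H": {"9", "10", "11", "13", "14", "41", "42", "43", "84", "87",
          "89", "105", "108", "110", "112", "113"},
}


def parse_label(label):
    """Split a label such as 'L106A' into chain 'L' and position '106A'."""
    match = re.fullmatch(r"([LH])(\d+[A-Z]?)", label)
    if match is None:
        raise ValueError(f"not a Kabat-style label: {label!r}")
    return match.group(1), match.group(2)


def in_interface(label):
    """True if the labelled position belongs to the listed interface."""
    chain, position = parse_label(label)
    return position in INTERFACE[chain]


print(in_interface("L106A"))
print(in_interface("H84"))
print(in_interface("L84"))
```

Positions are kept as strings so that insertion codes such as 106A are
preserved.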

In a further embodiment, said parent antibody 
fragment comprises a Fab fragment. In this context, the term 
Fab refers to a complex comprising the VL-CL (variable and 
constant light) and VH-CH1 (variable and first constant heavy) 
portions of the antibody molecule, and the term interface 
region refers to a region in the first constant domain of the 
heavy chain (CH1) which is, or would be, adjoined to the CH2
domain in a larger antibody fragment or full antibody. 

In a still further embodiment, said parent IgSF 
fragment is a fusion protein of any of said domains or 
fragments and another protein domain, derived from an antibody 
or any other protein or peptide. The advent of bacterial
expression of antibody fragments has opened the way to the
construction of proteins comprising fusions between antibody
fragments and other molecules. A further embodiment of the
present invention relates to such fusion proteins by providing 
for a DNA sequence which encodes both the modified IgSF domain 
or fragment, as described above, as well as an additional 
moiety. Particularly preferred are moieties which have a 
useful therapeutic function. For example, the additional 
moiety may be a toxin molecule which is able to kill cells 
(Vitetta et al., 1993). There are numerous examples of such
toxins, well known to those skilled in the art, such as the
bacterial toxins Pseudomonas exotoxin A and diphtheria toxin,
as well as the plant toxins ricin, abrin, modeccin, saporin,
and gelonin. By fusing such a toxin to an antibody fragment,
the toxin can be targeted to, for example, diseased cells, and
thereby have a beneficial therapeutic effect. Alternatively,
the additional moiety may be a cytokine, such as IL-2 
(Rosenberg & Lotze, 1986), which has a particular effect (in 
this case a T-cell proliferative effect) on a family of cells. 
In a further preferred embodiment, the additional moiety is at 
least part of a surface protein which may direct the fusion 
protein to the surface of an organism, for example, a cell or a 
phage, and thereby displays the IgSF partner. Preferably, the
additional moiety is at least part of a coat protein of
filamentous bacteriophages, most preferably of the gene III
protein. In a further embodiment, the additional moiety may
confer on its IgSF partner a means of detection and/or 
purification. For example, the fusion protein could comprise 
the modified IgSF domain or fragment and an enzyme commonly 
used for detection purposes, such as alkaline phosphatase 
(Blake et al., 1984). There are numerous other moieties which 
can be used as detection or purification tags, which are well 
known to the practitioner skilled in the art. Particularly 
preferred are peptides comprising at least five histidine 
residues (Hochuli et al., 1988), which are able to bind to 
metal ions, and can therefore be used for the purification of 
the protein to which they are fused (Lindner et al., 1992).
Also provided for by the invention are additional moieties such
as the commonly used c-myc and FLAG tags (Hopp et al., 1988;
Knappik & Plückthun, 1994).

By engineering one or more fused additional domains,
IgSF domains or fragments can be assembled into larger
molecules which also fall under the scope of the present
invention. To the extent that the physical properties of the
IgSF domain or fragment determine the characteristics of the
assembly, the present invention provides a means of increasing
the solubility of such larger molecules. For example,
mini-antibodies (Pack, 1994) are dimers comprising two antibody
fragments, each fused to a self-associating dimerization
domain. Dimerization domains which are particularly preferred
include those derived from a leucine zipper (Pack & Plückthun,
1992) or helix-turn-helix motif (Pack et al., 1993).

All of the above embodiments of the present invention 
can be effected using standard techniques of molecular biology 
known to anyone skilled in the art. 

The compositions described above may have utility in 
any one of a number of settings. Particularly preferred are 
diagnostic and therapeutic compositions. 

The present invention also provides methods for 
making the compositions and compounds comprised therein 
described above. Particularly preferred is a method comprising 
the following steps: 

i) analyzing the interface region of an IgSF domain for 
hydrophobic residues which are solvent-exposed using either a 
solvent-accessibility approach (Lee & Richards, 1971), 
analysis of van der Waals interactions in the interface 
region, or similar methods which are well known to one 
skilled in the art, 

ii) identifying one or more of the hydrophobic residues to be
substituted by more hydrophilic residues, or one or more
positions where hydrophilic residues or amino acid stretches
enhancing the overall hydrophilicity of the interface region
can be inserted into said interface region, or one or more
positions where hydrophobic residues or amino acid stretches
enhancing the overall hydrophobicity of the interface region
can be deleted from said interface region, or any combination
of said substitutions, said insertions, and said deletions to
give one or more mutants of said parent IgSF domain,

iii) preparing DNA encoding mutants of said IgSF domain,
characterized by the changes identified in ii), by using e.g.
conventional mutagenesis or gene synthesis methods, said DNA
being prepared either separately or as a mixture,

iv) introducing said DNA or DNA mixture in a vector system 
suitable for expression of said mutants, 

v) introducing said vector system into suitable host cells 
and expressing said mutant or mixture of mutants, 

vi) identifying and characterizing mutants which are obtained 
in higher yield in soluble form, and 

vii) if necessary, repeating steps iii) to vi) to increase the 
hydrophilicity of said identified mutant or mutants further. 
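
Step ii) above, restricted to substitutions, can be sketched as a
simple enumeration of candidate mutants. The function name, the
position labels and the reduced replacement set used in the example
are hypothetical; the list of hydrophilic residues is the one named as
preferred earlier in the text.

```python
from itertools import product

# Hydrophilic replacement residues named as preferred in the text.
HYDROPHILIC = ["Asn", "Asp", "Arg", "Gln", "Glu",
               "Gly", "His", "Lys", "Ser", "Thr"]


def enumerate_mutants(candidates, replacements):
    """Yield every combination in which each candidate position either
    keeps its parent residue or receives one replacement residue.

    candidates: dict mapping position label -> parent residue.
    Each mutant is a dict of the positions that differ from the parent;
    the unmodified parent itself is skipped.
    """
    positions = sorted(candidates)
    options = [[candidates[p]] + list(replacements) for p in positions]
    for choice in product(*options):
        mutant = {p: aa for p, aa in zip(positions, choice)
                  if aa != candidates[p]}
        if mutant:
            yield mutant


# Hypothetical positions and a reduced replacement set, for illustration.
mutants = list(enumerate_mutants({"P1": "Leu", "P2": "Val"}, ["Asp", "Glu"]))
print(len(mutants))
for mutant in mutants:
    print(mutant)
```

With the full replacement list the number of combinations grows
quickly, which is one reason the DNA in step iii) may be prepared as a
mixture and the mutants screened in step vi).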

The host referred to above may be any of a number
commonly used in the production of heterologous proteins,
including but not limited to bacteria, such as E. coli (Ge et
al., 1995) or Bacillus subtilis (Wu et al., 1993), fungi, such
as yeasts (Horwitz et al., 1988; Ridder et al., 1995) or
filamentous fungi (Nyyssonen et al., 1993), plant cells
(Hiatt, 1990; Hiatt & Ma, 1993; Whitelam et al., 1994), insect
cells (Potter et al., 1993; Ward et al., 1995), or mammalian
cells (Trill et al., 1995).

The invention also relates to a method for the 
production of an IgSF domain or fragment of the invention 
comprising culturing a host cell of the invention and isolating 
said domain or fragment.

The invention is now demonstrated by the following 
examples, which are presented for illustration only and are not 
intended to limit the scope of the invention. 



Examples 



i) Abbreviations 

Abbreviations are defined as follows: CDR:
complementarity determining region; dsFv: disulfide-linked Fv
fragment; IMAC: immobilized metal ion affinity chromatography;
IPTG: isopropyl-β-D-thiogalactopyranoside; i/s: ratio
insoluble/soluble; H(X): heavy chain residue number X; L(X):
light chain residue number X; NTA: nitrilo-triacetic acid;
OD550: optical density at 550 nm; PDB: protein database; scFv:
single-chain Fv fragment; SDS-PAGE: sodium dodecyl sulfate
polyacrylamide gel electrophoresis; v/c: variable/constant; wt:
wild type.



ii) Material and Methods 

(a) Calculation of solvent accessibility 

Solvent accessible surface areas for 30 non-redundant
Fab fragments and the Fv fragments derived from these by
deleting the constant domain coordinates from the PDB file were
calculated using the latest version, as of March 1996, of the
program NACCESS (http://www.biu...), based on
the algorithm described by Lee & Richards (1971).

(b) scFv gene synthesis 

The single-chain Fv fragment (scFv) in the
orientation VL-linker-VH of the antibody 4-4-20 (Bedzyk et al.,
1990) was obtained by gene synthesis (Prodromou and Pearl,
1992). The VL domain carries a three-amino acid long FLAG tag
(Knappik and Plückthun, 1994). We have used two different
linkers with a length of 15 amino acids, (Gly4Ser)3, and 30
amino acids, (Gly4Ser)6, respectively. The gene so obtained was
cloned into a derivative of the vector pIG6 (Ge et al., 1995).
The mutant antibody fragments were constructed by site-directed
mutagenesis (Kunkel et al., 1987) using single-stranded DNA and
up to three oligonucleotides per reaction.
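
The two linkers and the VL-linker-VH arrangement can be written out as
a short sketch. The helper names are hypothetical, and the domain
strings are placeholders, not the 4-4-20 sequences.

```python
def repeat_linker(unit="GGGGS", repeats=3):
    """Build a (Gly4Ser)n linker in one-letter code."""
    return unit * repeats


def assemble_scfv(vl, linker, vh):
    """Concatenate the parts in the VL-linker-VH orientation."""
    return vl + linker + vh


short_linker = repeat_linker(repeats=3)  # (Gly4Ser)3
long_linker = repeat_linker(repeats=6)   # (Gly4Ser)6
print(len(short_linker), len(long_linker))

# Placeholder domain strings stand in for the real VL and VH sequences.
print(assemble_scfv("<VL>", short_linker, "<VH>"))
```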



(c) Expression 

Growth curves were obtained as follows: 20 ml of 2xYT
medium containing 100 µg/ml ampicillin and 25 µg/ml
streptomycin were inoculated with 250 µl of an overnight
culture of E. coli JM83 harboring the plasmid encoding the
respective antibody fragment and incubated at 24.5°C until an
OD550 of 0.5 was reached. IPTG (Biomol Feinchemikalien GmbH)
was added to a final concentration of 1 mM and incubation was
continued for 3 hours. The OD was measured every hour, as was
the β-lactamase activity in the culture supernatant to
quantify the degree of cell leakiness. Three hours after
induction an aliquot of the culture was removed and the cells
were lysed exactly as described by Knappik and Plückthun
(1995). The β-lactamase activity was measured in the
supernatant, in the insoluble and in the soluble fraction. The
fractions were assayed for antibody fragments by reducing
SDS-PAGE, with the samples normalized to OD and β-lactamase
activity to account for possible plasmid loss as well as for
cell leakiness. The gels were blotted and immunostained using
the FLAG antibody M1 (Prickett et al., 1989) as the first
antibody, an Fc-specific anti-mouse antiserum conjugated to
horseradish peroxidase (Pierce) as the second antibody, using a
chemiluminescent detection assay described elsewhere (Ge et
al., 1995).



(d) Purification 



Mutant scFv fragments were purified by a two-column
procedure. After French press lysis of the cells, the raw E.
coli extract was first purified by IMAC (Ni-NTA superflow,
Qiagen) (20 mM HEPES, 500 mM NaCl, pH 6.9; step gradient of
imidazole 10, 50 and 200 mM) (Lindner et al., 1992) and, after
dialyzing the IMAC eluate against 20 mM MES pH 6.0, finally
purified by cation exchange chromatography (S-Sepharose fast
flow column, Pharmacia) (20 mM MES, pH 6.0; salt gradient 0-500
mM NaCl). Purity was controlled by Coomassie stained SDS-PAGE.
The functionality of the scFv was tested by competition ELISA.

Because of its very poor solubility in the
periplasmic system, the wt 4-4-20 was expressed as cytoplasmic
inclusion bodies in the T7-based system (Studier & Moffatt,
1986; Ge et al., 1995). The refolding procedure was carried
out as described elsewhere (Ge et al., 1995). For
purification, the refolding solution (2 l) was loaded over 10 h
without prior dialysis onto a fluorescein affinity column,
followed by a washing step with 20 mM HEPES, 150 mM NaCl, pH
7.5. Two column volumes of 1 mM fluorescein (sodium salt,
Sigma Chemicals Co.) pH 7.5 were used to elute all functional
scFv fragment. Extensive dialysis (7 days with 12 buffer
changes) was necessary to remove all fluorescein. All purified
scFv fragments were tested in gel filtration (Superose-12
column, Pharmacia SMART-System, 20 mM HEPES, 150 mM NaCl, pH
7.5).

(e) Kd determination by fluorescence titration 

The concentrations of the proteins were determined
photometrically using an extinction coefficient calculated
according to Gill and von Hippel (1989). Fluorescence
titration experiments were carried out by taking advantage of
the intense fluorescence of fluorescein. Two ml of 20 mM
HEPES, 150 mM NaCl, pH 7.5 containing 10 or 20 nM fluorescein
were placed in a cuvette with integrated stirrer. The
excitation wavelength was 485 nm; emission spectra were
recorded from 490 to 530 nm. Purified scFv (in 20 mM HEPES,
150 mM NaCl, pH 7.5) was added in 5 to 100 µl aliquots, and
after a 3 min equilibration time a spectrum was recorded. All
spectra were recorded at 20°C. The emission maximum at
510 nm was used for determining the degree of complexation of
scFv to fluorescein, seen as quenching as a function of the
concentration of the antibody fragment. The KD value was
determined by Scatchard analysis.
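
A Scatchard analysis of such titration data can be sketched as follows,
assuming simple 1:1 binding and assuming that the concentration of
complex has already been derived from the fractional quenching. For
1:1 binding, bound/free = (total fluorescein - bound)/KD, so a plot of
bound/free against bound has slope -1/KD. The function names and the
synthetic, noise-free data (including the "true" KD of 5 nM) are
hypothetical and serve only to show the calculation.

```python
import math


def complex_conc(p_total, l_total, kd):
    """Concentration of 1:1 complex from the quadratic binding equation."""
    s = p_total + l_total + kd
    return (s - math.sqrt(s * s - 4.0 * p_total * l_total)) / 2.0


def scatchard_kd(p_totals, bound):
    """Least-squares line through (bound, bound/free); KD = -1/slope.

    p_totals: total scFv concentration at each titration point.
    bound:    complex concentration at each point (same units).
    """
    xs = list(bound)
    ys = [b / (p - b) for p, b in zip(p_totals, bound)]  # bound / free scFv
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return -sxx / sxy  # equals -1/slope


# Synthetic data in nM: 20 nM fluorescein, arbitrary "true" KD of 5 nM.
l_total, true_kd = 20.0, 5.0
p_totals = [2.0, 5.0, 10.0, 20.0, 40.0, 80.0]
bound = [complex_conc(p, l_total, true_kd) for p in p_totals]
print(round(scatchard_kd(p_totals, bound), 3))
```

With real data the points scatter around the line, and the quality of
the fit depends strongly on how accurately the fully quenched signal
is known.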

(f) Equilibrium denaturation measurement 

Equilibrium denaturation curves were obtained by
denaturation of 0.2 µM protein in HEPES buffered saline (HBS)
buffer (20 mM HEPES, 150 mM NaCl, 1 mM EDTA, pH 7.5) and
increasing amounts of urea (1.0-7.5 M; 20 mM HEPES, 150 mM
NaCl, pH 7.4; 0.25 M steps) in a total volume of 1.7 ml. After
incubating the samples at 10°C for 12 hours and an additional
3 hours at 20°C prior to measurements, the fluorescence spectra
were recorded at 20°C from 320-360 nm with an excitation
wavelength of 280 nm. The emission wavelength of the
fluorescence peak shifted from 341 to 347 nm during
denaturation and was used for determining the fraction of
unfolded molecules. Curves were fitted according to Pace
(1990).
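
The kind of curve fitted in such an analysis can be illustrated with
the two-state linear extrapolation model, in which the free energy of
unfolding decreases linearly with urea concentration. This is a sketch
under that assumption; the parameter values below are hypothetical and
are not fitted values from the experiments described here.

```python
import math

R = 8.314e-3  # gas constant in kJ/(mol*K)


def fraction_unfolded(urea, dg_water, m_value, temperature=293.15):
    """Two-state fraction unfolded at a given urea concentration (M).

    dg_water: free energy of unfolding without denaturant, kJ/mol.
    m_value:  dependence of that free energy on urea, kJ/(mol*M).
    temperature: in kelvin; 293.15 K corresponds to 20 degrees C.
    """
    dg = dg_water - m_value * urea
    k_unfold = math.exp(-dg / (R * temperature))
    return k_unfold / (1.0 + k_unfold)


def midpoint(dg_water, m_value):
    """Urea concentration at which half of the molecules are unfolded."""
    return dg_water / m_value


# Hypothetical parameters, for illustration only.
dg_water, m_value = 30.0, 7.5
print(midpoint(dg_water, m_value))
for urea in (1.0, 4.0, 7.5):
    print(urea, round(fraction_unfolded(urea, dg_water, m_value), 3))
```

In a real fit, dg_water and m_value would be adjusted so that the
model reproduces the measured fraction unfolded at each urea
concentration.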



(g) Thermal denaturation 



For measuring the thermal denaturation rates,
purified scFv was dissolved in 2 ml HBS buffer to a final
concentration of 0.5 µM. The aggregation was followed for 2.5
h at 40°C and at 44°C by light scattering at 400 nm.

iii) Results 

(a) Comparison of known antibody sequences 

Compared to other domain/domain interfaces in
proteins, the interface between immunoglobulin variable and
constant domains is not very tightly packed. A comparison of
30 non-redundant Fab structures in the PDB database showed that
between the light chain variable and constant domain an area of
410 ± 90 Å² per domain is buried, while the heavy chain
variable and constant domains interact over an area of
710 ± 180 Å². Some, but not all, of the interface residues are
hydrophobic, predominantly aliphatic. Generally, sequence
conservation of the residues contributing to the v/c domain
interface is not particularly high. Still, the v/c domain
interface shows up as a marked hydrophobic patch on the surface
of an Fv fragment (Fig. 1).

Solvent accessible surface areas for 30 non-redundant
Fab fragments and their corresponding Fv fragments (derived
from the Fab fragment by deleting the constant domain
coordinates from the PDB file) were calculated using the
program NACCESS (Lee & Richards, 1971). Residues participating
in the v/c domain interface were identified by comparing the
solvent-accessible surface area of each amino acid side chain
in the context of an Fv fragment to its accessible surface in
the context of an Fab fragment. Figure 2 shows a plot of the
relative change in side chain accessibility upon deletion of
the constant domains as a function of sequence position.
Residues which show a significant reduction of side chain
accessibility are also highlighted in the sequence alignment.
To assess sequence variability in the positions identified in
Figure 2, the variable domain sequences collected in the Kabat
database (status March 1996) were analyzed (Table 1). Of the
15 interface residues identified in the VL domain of the
antibody 4-4-20 (Fig. 1 and Table 1), L9 (leu), L12 (pro),
L15 (leu), L40 (pro), L83 (leu), and L106 (ile) are hydrophobic
and therefore candidates for replacement. Of the 16 interface
residues in the VH domain, H11 (leu), H14 (pro), H41 (pro),
H84 (val), H87 (met) and H89 (ile) were identified as possible
candidates for substitution by hydrophilic residues in the scFv
fragment of the antibody 4-4-20 (Fig. 1 and Table 1).

Not all of these hydrophobic residues are equally 
good candidates for replacements, however. While residues 
which are hydrophobic in one particular sequence but 
hydrophilic in many other sequences may appear most attractive, 
the conserved hydrophobic residues listed in Table 1 have also 
been investigated, since the evolutionary pressure which kept 
these conserved residues acted on the Fab fragment within the 
whole antibody, but not the isolated Fv portion. In this 
study, we did not replace the proline residues since pro L40 
and pro H41 form the hairpin turns at the bottom of the 
framework II region, while the conserved VL cis-proline L8 and
proline residues H9 and H14 determine the shape of framework I 
of the immunoglobulin variable domains. 

Excluding prolines, this leaves residues L9 (leu in
4-4-20, ser in most kappa chains), L15 (leu, usually
hydrophobic), L83 (leu, usually val or phe) and L106 (ile, as
in 86% of all kappa chains) in the VL domain and H11 (leu, as in
60% of all heavy chains), H84 (val, in other VH domains
frequently ala or ser), H87 (met, usually ser) and H89 (ile,
most frequently val) in VH as possible candidates for
replacement in the 4-4-20 scFv fragment.
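
The selection step can be written down as a simple filter. The sketch below uses the hydrophobic interface residues of 4-4-20 as listed in the text; the dictionary layout and the function name are assumptions made for illustration only.

```python
# Hydrophobic v/c interface residues of the 4-4-20 scFv as listed in the
# text (Kabat numbering). The data structure itself is an assumption.
INTERFACE_HYDROPHOBIC = {
    "L9": "leu", "L12": "pro", "L15": "leu", "L40": "pro",
    "L83": "leu", "L106": "ile",
    "H11": "leu", "H14": "pro", "H41": "pro",
    "H84": "val", "H87": "met", "H89": "ile",
}

def replacement_candidates(residues):
    """Drop prolines, which are retained for structural reasons."""
    return [pos for pos, aa in residues.items() if aa != "pro"]

candidates = replacement_candidates(INTERFACE_HYDROPHOBIC)
print(candidates)
```

The filter leaves the four VL and four VH positions named in the preceding paragraph.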

(b) Mutations in the 4-4-20 scFv 

For the 4-4-20 scFv fragment some of the crucial 
residues identified in the sequence analysis described above 
are already hydrophilic, but nevertheless 9 residues are of 
hydrophobic nature (including pro12 in the light chain) (Table
1). We chose three residues for closer analysis by mutations. 

Leu15 in VL is a hydrophobic amino acid in 98% of
all kappa chains (Table 1). Leu11 is conserved in VH (Table 1)
and is involved in v/c interdomain contacts (Lesk & Chothia,
1988). In contrast, valine occurs very infrequently at
position H84; mainly threonine or serine and alanine are found
at this position (Table 1). As can be seen in Figure 1,
val84 contributes to a large hydrophobic patch at the newly
exposed surface of VH. All three positions were mutated into
acidic residues, and leu11 in VH was also changed to asparagine
(Table 2).

The scFv fragment was tested and expressed with two
different linkers, the 15-mer linker (Gly4Ser)3 (Huston et al.,
1995) and the same motif extended to 30 amino acids, (Gly4Ser)6.
All mutations were tested in both constructs. The in vivo
effects of the different mutations on solubility were
identical, and therefore only the results for the 30-mer linker
are described in more detail. The periplasmic expression
experiments were carried out at 24.5°C, and all constructs were
tested for soluble and insoluble protein by immunoblotting.
The ratio of insoluble to soluble (i/s) protein was determined
for every mutant. In Figure 3A-D, insoluble (lane 1) and
soluble (lane 2) fractions of the wt scFv are shown. Almost no
soluble material occurs in periplasmic expression, which is
consistent with previous reports of Bedzyk et al. (1990) and
Denzin et al. (1991), who described that the
periplasmic expression of the wt scFv leads mainly to
periplasmic inclusion bodies.

The single point mutation L15E in VL (Flu1) shows no
effect on the i/s ratio when compared with the wt (Fig. 3A,
lane 3, 4). Mutating leu at position 11 in the heavy chain to
asparagine (Flu2) also shows nearly no effect compared to the
wt, whereas the substitution with aspartic acid (Flu3) shifts
the i/s ratio towards more soluble protein, although this effect
is not very dramatic. In contrast, the point mutation at position
84 (Flu4, Fig. 3B, lane 3, 4 and Fig. 3D, lane 3, 4) had a
dramatic influence on the solubility of the scFv fragment of
the antibody 4-4-20. The i/s ratio is changed to about 1:1,
resulting in a 25-fold increase of soluble protein compared to
the wt.
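
The relation between the i/s ratio and the soluble fraction can be made explicit with a short calculation. The sketch below assumes that the 25-fold increase refers to the soluble fraction at comparable total expression; under that assumption the wt i/s ratio is back-calculated and is not a measured value.

```python
def soluble_fraction(i_over_s):
    """Fraction of the total protein that is soluble for a given i/s ratio."""
    return 1.0 / (1.0 + i_over_s)

def implied_ratio(fraction):
    """Inverse relation: the i/s ratio that corresponds to a soluble fraction."""
    return (1.0 - fraction) / fraction

mutant = soluble_fraction(1.0)   # i/s of about 1:1 reported for Flu4
wt = mutant / 25.0               # assumes the 25-fold increase applies to this fraction
print(mutant, wt, implied_ratio(wt))
```

Under these assumptions about half of the Flu4 protein and roughly 2% of the wt protein would be soluble, corresponding to a wt i/s ratio of about 49:1.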

The combination of V84D with L11N or L11D (Flu5,
Flu6) also changes the i/s ratio compared to the wt, but this
ratio is not improved further compared to V84D alone (Fig. 3B).
Interestingly, the combination of Flu5 with the light chain
mutation at position 15 (Flu9) leads to less soluble material
(Fig. 3C, lane 7, 8) than Flu5 itself (Fig. 3B, lane 5, 6). The
negative influence of the L15E mutation can also be seen in
Flu8 (Fig. 3C, lane 5, 6) compared with Flu3 (Fig. 3A, lane 7,
8). In Fig. 3D the comparison of the wt (lane 1, 2 and 5, 6)
and Flu4 (lane 3, 4 and 7, 8) is shown in both the 15-mer and
the 30-mer construct.

The negative effect of L15E can be rationalized by
looking at a model of the 4-4-20 scFv fragment. L15 forms
a hydrophobic pocket together with residues A80, L83, and L106.
Apparently, L15 stabilizes the scFv fragment by hydrophobic
interactions with its closest neighbours. Thus the exchange
L15E, intended to make the scFv fragment more hydrophilic and
more soluble, is made at the expense of the fragment stability.
The analysis of hydrophobic interactions within a fragment should
therefore be used to choose the solvent-exposed residues to be
mutated in the case of any other antibody fragment.

Combinations of various serine mutations in VH led to further
improvements in the i/s ratio. The mutants FH15 (V84S, M87S,
I89S) and FH20 (L11S, V84S, M87S, I89S) both showed more than
70% of soluble protein in immunoblots (data not shown).

(c) Functional expression and purification 

The oligomerization of scFv fragments as a function
of linker length has been investigated previously. A
continuous decrease in the amount of dimer and multimer
formation as a function of linker length has been reported
(Desplancq et al., 1994; Whitlow et al., 1994). While the
(Gly4Ser)3 linker has been shown to lead to monomeric scFvs in
many cases in the VH-VL direction, this is often not the case
in the VL-VH direction. This is caused by an asymmetry in the
VL/VH arrangement, leading to a longer distance between the end
of VH and the N-terminus of VL than between C-terminus of VL
and N-terminus of VH (Huston et al., 1995). Consequently, a
linker of identical length may lead to different properties of
the resulting molecules.

Since we have chosen to use the minimal perturbation
FLAG (Knappik & Plückthun, 1994) at the N-terminus of VL in our
constructs and thus the VL-linker-VH orientation, we have
investigated the use of longer linkers. In the periplasmic
expression in E. coli no difference between the 15-mer and the
30-mer linker in the corresponding mutants is visible (Fig.
3D), but when we attempted to purify the two Flu4 scFvs with
long and short linker, a big discrepancy between the two
constructs was found. The purification of the Flu4 mutant
(V84D) with the 15-mer linker leads to very small amounts of
partially purified protein (about 0.015 mg per liter and OD;
estimated from SDS-PAGE after IMAC purification), whereas the
30-mer linker construct gives about 0.3 mg per liter and OD of
highly pure functional protein. All mutants with 30-mer linker
were tested in gel filtration and found to be monomeric (data
not shown).

For further in vitro characterization five mutants 
were purified with the 30-mer linker, V84D (Flu4), V84D/L11D 
(Flu6), L11D (Flu3), and the serine mutants FH15 and FH20 (see 
iii (b)). A two-step chromatography, first using IMAC and then
cation-exchange chromatography, led to homogeneous protein. 
The i/s ratio of the antibody fragments (Fig. 3) was also 
reflected in the purification yield of functional protein. The 
highly soluble mutant Flu4 (V84D) (Fig. 3B lane 3, 4) yielded 
about 0.3 mg purified and functional protein per liter and OD, 
Flu6 (L11D/V84D) (Fig. 3B lane 7, 8) yielded about 0.25 mg per 
liter and OD and Flu3 (less soluble material on the blot in 
Fig. 3A lane 7, 8) yielded 0.05 mg per liter and OD. The 
serine mutants FH15 and FH20 yielded 0.3 mg and 0.4 mg per
liter and OD, respectively. The wt scFv of the antibody 4-4-20 
did not give any soluble protein at all in periplasmic 
expression with either linker, and it was therefore expressed 
as cytoplasmic inclusion bodies, followed by refolding in vitro 
and fluorescein affinity chromatography. The refolded wt scFv 
was shown by gel filtration to be monomeric with the 30-mer 
linker (data not shown).

(d) Biophysical properties of the mutant scFvs 

Since we changed amino acids which are conserved, it
cannot be excluded that changes at these positions may be
transmitted through the structure and have an effect on the
binding constant, even though they are very far from the
binding site (Chatellier et al., 1996). To eliminate this
possibility, we determined the binding constant of the mutants
Flu3, Flu4, Flu6 and the wt scFv. Fluorescence titration was
used to determine Kd in solution by using the quenching of the
intrinsic fluorescence of fluorescein when it binds to the
antibody. The fluorescence quenching at 510 nm was measured as
a function of added scFv. The Kd values (Table 3 and Fig. 4)
obtained for all three mutant scFvs and the wt scFv are very
similar and correspond very well to the recently corrected Kd
of the monoclonal antibody 4-4-20 (Miklasz et al., 1995).
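
A Scatchard evaluation of such a titration can be sketched as follows. The example assumes 1:1 binding of fluorescein to the scFv and uses synthetic, noise-free data; the Kd of 50 nM and the fluorescence values are hypothetical and are not the values of Table 3. The saturation r is computed from the fluorescence as defined in the legend of Figure 4.

```python
import math

def bound_fraction(ab_total, lig_total, kd):
    """Exact 1:1 binding: fraction of ligand bound (all concentrations in nM)."""
    s = ab_total + lig_total + kd
    complex_conc = (s - math.sqrt(s * s - 4.0 * ab_total * lig_total)) / 2.0
    return complex_conc / lig_total

def saturation(f, f0, f_inf):
    """r = (F - F0) / (Finf - F0)."""
    return (f - f0) / (f_inf - f0)

def scatchard_kd(ab_totals, r_values, lig_total):
    """Least-squares line through r/[Ab]free versus r; Kd = -1/slope."""
    xs = list(r_values)
    ys = [r / (ab - r * lig_total) for ab, r in zip(ab_totals, r_values)]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return -sxx / sxy

LIGAND_NM = 20.0          # fluorescein concentration, as in Figure 4
TRUE_KD_NM = 50.0         # hypothetical value used to generate the data
F0, F_INF = 100.0, 10.0   # hypothetical fluorescence units (quenching)

ab_totals = [4.0, 10.0, 25.0, 50.0, 100.0, 200.0, 400.0, 800.0]
signals = [F0 + bound_fraction(ab, LIGAND_NM, TRUE_KD_NM) * (F_INF - F0)
           for ab in ab_totals]
r_values = [saturation(f, F0, F_INF) for f in signals]
kd_fit = scatchard_kd(ab_totals, r_values, LIGAND_NM)
print(round(kd_fit, 3))
```

For 1:1 binding, r/[Ab]free = (1 - r)/Kd, so the plot is a straight line with slope -1/Kd and the fit recovers the Kd used to generate the data.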

To determine whether the mutations had an influence
on the thermodynamic stability of the protein we determined the
equilibrium unfolding curves by urea denaturation. The V84D mutant
and the wt scFv were used for this analysis, and in Figure 5 an
overlay plot is shown. The midpoint of both curves is at 4.1 M
urea. Both curves were fitted by an algorithm for a two-state
model described by Pace (1990), but the apparent small
difference between the V84D mutant and the wt scFv is not of
statistical significance.
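
A two-state unfolding curve of this kind can be sketched with the commonly used linear extrapolation form, in which the free energy of unfolding depends linearly on the urea concentration. Whether exactly this form was fitted here is an assumption; the m-value below is hypothetical and only the midpoint of 4.1 M urea is taken from the text.

```python
import math

R = 8.314e-3  # gas constant in kJ / (mol K)

def fraction_unfolded(urea_molar, dg_water, m_value, temp_k=298.15):
    """Two-state model with dG = dG_water - m * [urea] (kJ/mol)."""
    dg = dg_water - m_value * urea_molar
    k_unfold = math.exp(-dg / (R * temp_k))
    return k_unfold / (1.0 + k_unfold)

M_VALUE = 5.0                   # kJ/(mol M), hypothetical
MIDPOINT = 4.1                  # M urea, as reported for both curves
DG_WATER = M_VALUE * MIDPOINT   # dG is zero at the midpoint by construction

for urea in (0.0, 2.0, 4.1, 6.0, 8.0):
    print(urea, round(fraction_unfolded(urea, DG_WATER, M_VALUE), 3))
```

In this model the midpoint equals dG_water / m, so two proteins with the same midpoint and similar m-values have similar extrapolated stabilities.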

Aggregation of folding intermediates could be another
explanation for the different in vivo results between the
mutant scFvs and the wt scFv (Fig. 3). In the periplasm of E.
coli, the protein concentrations are assumed to be rather high
(van Wielink & Duine, 1990) and the aggregation effects could
thus be pronounced. In order to estimate the aggregation
behavior in vitro, we have measured the thermal aggregation
rates at different temperatures. In Figure 6 it is clearly
seen that the wt scFv already aggregates significantly at
44 °C, whereas the mutant V84D tends to aggregate more slowly.
The wt scFv is thus clearly more aggregation prone than the
mutant scFv. This is very similar to the observations made
with different mutations on the antibody McPC603 (Knappik and
Plückthun, 1995), where no correlation was found between
equilibrium denaturation curves and expression behavior, but a
good correlation was found with the thermal aggregation rates.



Figures and Tables

Figure 1: Space-filling representation of the Fv fragment
of the antibody 4-4-20

Figure 2: Variable/constant domain interface residues for
VL (2a) and VH (2b). For 30 non-redundant Fab fragments
taken from the Brookhaven Databank, the solvent
accessible surface of the amino acid side chains was
calculated in the context of an Fv and of an Fab
fragment. The plot shows the relative reduction in
accessible surface upon contact with the constant
domains (overlay plot for all 30 Fv fragments). In the
sequence alignment, residues contributing to the v/c
interface are highlighted. The symbols indicate the
relative reduction of solvent accessible surface upon
removing the constant domains (symbols: no symbol < 1%;
1 < 20%; n < 40%; s < 60%; t < 80%, and u ≥ 80%).
Circles indicate those positions which are further
analyzed (see Table 1).

Figure 3: Western blots showing the insoluble (i) and
soluble (s) fractions of cell extracts, prepared as
described in Material and Methods, expressing the scFv
fragments of the antibody 4-4-20. The amino acids
substituted in the various mutants are given in Table 2.

Figure 4: Scatchard plot of the fluorescence titration of
fluorescein (20 nM) with antibody (4 to 800 nM),
measured at 510 nm. The value r was obtained from
(F - F0)/(F∞ - F0), where F is the measured fluorescein
fluorescence at a given antibody concentration, F0 is
the fluorescence in the absence of antibody and F∞ the
fluorescence when antibody is present in large excess.
Note that r gives the saturation of fluorescein by
antibody. (a) Titration of wt scFv, (b) titration of
Flu4 (V84D).

Figure 5: An overlay plot of the urea denaturation curves
is shown. (x) wt scFv, (o) Flu4.

Figure 6: Thermal denaturation time courses at 40 and 44 °C
for wt and Flu4 scFv fragment are shown. (a) wt scFv at
40°C, (b) Flu4 at 40°C, (c) Flu4 at 44°C, (d) wt scFv
at 44°C.

Table 1: Sequence variability of residues contributing to the
v/c interface: Residue statistics are based on the
variable domain sequences in the Kabat database (March
1996). Sequences which were <90% complete were excluded
from the analysis. Number of sequences analyzed: human
VL kappa: 404 of 881, murine VL kappa: 1061 of 2239,
human VL lambda: 223 of 409, murine VL lambda: 71 of
206, human VH: 663 of 1756, murine VH: 1294 of 3849.
Position refers to the sequence position according to
Kabat et al. 1991, %exp. (Fab) to the relative side
chain accessibility in an Fab fragment as calculated by
the program NACCESS (NACCESS v2.0 by Simon Hubbard
(http://www.biochem.ucl.ac.uk/~roman/naccess/naccess.html)),
%exp. (ind.) to the relative side chain
accessibility in the isolated VL or VH domain, %buried
to the relative difference in side chain accessibility
between Fv and Fab fragment. Consensus refers to the
sequence consensus, and Distribution to the
distribution of residue types.

Table 2: Mutations introduced in the scFv fragment of the
antibody 4-4-20: Each line represents a different
protein carrying the mutations indicated. The residues
are numbered according to Kabat et al. (1991).

Table 3: KD values of the different scFv mutants determined in
fluorescence titration: The KD values are given in nM,
the error was calculated from the Scatchard analysis
(Fig. 4). # determined by Miklasz et al. (1995)